ABSTRACT

We construct an adaptive nonparametric estimate (recovery) of a signal defined on
multidimensional time and observed against background noise (adaptive filtering,
noise cancellation). The estimate is asymptotically optimal in the classical norm of
the space L(2) of square integrable functions and is based on a multidimensional
truncated Legendre expansion and an optimal experimental design.

The two-dimensional case is known as picture processing, picture analysis
or image processing.

We offer two versions of a confidence region construction, both adaptive.

The proposed estimates have been tested experimentally, both on simulated data
generated with pseudo-random numbers and on real data (seismic signals etc.),
where our estimates of the different signals were compared with classical estimates
obtained by kernel or wavelet methods. The proposed estimates were more precise.

Our adaptive truncation may also be used for signal and image compression.
1. Statement of problem. Let $V(n)$, $n = 16, 17, \ldots$ be a sequence of
vector-valued sets (experimental designs) in the cube $[-1,1]^d$, $d = 2, 3, \ldots$:

$$V(n) = \{x_i = x_i(n),\ i = 1, 2, \ldots, n\}, \quad x_i \in [-1,1]^d.$$

At the points $x_i$ we observe the unknown signal $f = f(x)$, $x \in [-1,1]^d$, against
background noise:

$$y(i) = f(x_i) + \sigma \xi_i, \qquad (1)$$

where $\{\xi_i\}$, the errors of measurement, is a sequence of independent (or
weakly dependent) random variables that are centered, $E\xi_i = 0$, and normed,
$\mathrm{Var}(\xi_i) = 1$; $\sigma = \mathrm{const} > 0$ is the standard deviation of the errors.
Our aim is the construction of an adaptive estimate (recovery) $f_n = f_n(x)$ of the
signal $f$ that is asymptotically optimal in the L(2) sense as $n \to \infty$:

$$\Delta^2(n) \stackrel{def}{=} E\|f_n(\cdot) - f(\cdot)\|^2 = E \int_{[-1,1]^d} |f_n(x) - f(x)|^2\,dx \to \min;$$

$f_n = f_n(x; V(n), \{y(i)\})$ is some measurement or, in other words, estimate
of the signal $f = f(x)$.

We consider in this report only the multidimensional case $d \ge 2$. The
one-dimensional case is considered in [3]. We notice that there are some essential
differences between the one-dimensional and multidimensional cases; we will show,
for example, that in the multidimensional case we can use only an optimal
experimental design.

In the multidimensional case $d \ge 2$ our signal, more exactly the function of
$x$, need not be a function of time.

Adaptiveness means that our estimates do not use any a priori information
about the estimated function $f$, for example, information about its class of
smoothness.

In other words, this problem is called "filtering of a signal against background
noise", "adaptive noise cancellation" or the "regression problem".

In the one-dimensional case $d = 1$ this problem was considered in many
publications ([1] - [5] etc). The case $d = 2$ is known as "picture processing" or,
equivalently, "image processing".

2. Notation. Assumptions. Construction of our estimate. Let
$z = \{z_j\}$, $j = 1, 2, \ldots, d$, $z_j \in [-1,1]$ be a $d$-dimensional vector,

$$F(z) = 2^{-d} \prod_{j=1}^{d} (1 + z_j), \qquad \delta(n) = \delta(n, V(n)) = \sup_z |G_n(z) - F(z)|, \qquad G_n(z) = n^{-1} \sum_{i=1}^{n} I(x_i < z),$$

where $I(x < z) = 1$ if $x_j < z_j$ for all $j = 1, 2, \ldots, d$, and $I(x < z) = 0$ otherwise.

The function $\delta = \delta(n) = \delta(n, V(n))$ is called the discrepancy
of the sequence of designs $V(n)$.

We suppose that

$$\delta(n) \le C(1,d)\,[\log(n)]^d / n. \qquad (2)$$

Note that in the one-dimensional case condition (2) is satisfied even
without the factor $\log(n)$ if $x_i = -1 + 2i/n$ (the uniform design); but in the
general case $d \ge 2$ we need to use, e.g., Niederreiter sequences (experimental
design) (see [6], p. 183 - 202), for which condition (2) is satisfied.

It is also proved in [6], p. 251 - 276 that for an arbitrary sequence of designs
$V = V(n)$ the discrepancy satisfies the inequality

$$\delta(n) \ge C(2,d)\,(\log n)^{d-1} / n.$$

Therefore Niederreiter sequences are quasi-optimal in the sense of the minimal
asymptotic behavior of the discrepancy $\delta(n)$ as $n \to \infty$.

In comparison, for the uniform $d$-dimensional design $\delta(n) \asymp C(3)\,n^{-1/d}$.

It is well known that for the so-called random experimental design, i.e. if the
vectors $\{x_i\}$ are random variables with the uniform distribution in the cube $[-1,1]^d$,

$$\delta(n) \ge C(3,d)\,(\log\log n)^{1/2} / \sqrt{n},$$

where $C(3,d)$ are random constants.

Therefore the uniform designs and the random designs are not asymptotically
optimal.

Note in addition that Niederreiter sequences allow us to construct a
sequential estimate of the signal $f(x)$.
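
The contrast between a low-discrepancy design and a random design can be illustrated
numerically. The sketch below is only an illustration under stated assumptions: it uses
a Halton sequence as a simple stand-in for the Niederreiter sequences of [6], the names
`radical_inverse`, `halton` and `discrepancy_lower_bound` are hypothetical, and the
supremum of $|G_n(z) - F(z)|$ is taken only over a finite set of random test points, so
the result is a lower bound for the discrepancy rather than its exact value.

```python
import numpy as np

def radical_inverse(i, base):
    """Van der Corput radical inverse of the integer i >= 0 in the given base."""
    x, f = 0.0, 1.0 / base
    while i > 0:
        x += (i % base) * f
        i //= base
        f /= base
    return x

def halton(n, d):
    """First n Halton points, mapped from [0,1)^d to [-1,1)^d (stand-in design)."""
    primes = [2, 3, 5, 7, 11, 13][:d]  # this sketch supports d <= 6
    pts = np.array([[radical_inverse(i, b) for b in primes]
                    for i in range(1, n + 1)])
    return 2.0 * pts - 1.0

def discrepancy_lower_bound(x, z):
    """max over test points z of |G_n(z) - F(z)|, F(z) = 2^{-d} prod_j (1 + z_j)."""
    d = x.shape[1]
    # G_n(z): fraction of design points x_i with x_ij < z_j for every j
    G = (x[None, :, :] < z[:, None, :]).all(axis=2).mean(axis=1)
    F = np.prod(1.0 + z, axis=1) / 2.0 ** d
    return float(np.abs(G - F).max())

rng = np.random.default_rng(0)
n, d = 1024, 2
z_test = rng.uniform(-1.0, 1.0, size=(2000, d))
delta_halton = discrepancy_lower_bound(halton(n, d), z_test)
delta_random = discrepancy_lower_bound(rng.uniform(-1.0, 1.0, size=(n, d)), z_test)
print(delta_halton, delta_random)
```

In such a run one would expect the value for the Halton design to be noticeably smaller
than the value for the random design, in line with the rates quoted above.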

Further, we assume that for some $q, Q \in (0, \infty)$

$$\forall u > 0 \quad P(|\xi_i| > u) \le \exp\left(-(u/Q)^q\right). \qquad (3)$$

Condition (3) is satisfied, e.g., if the errors of measurement have a
Gaussian distribution; in this case $q = 2$.

The consistent (as $n \to \infty$) estimates $Q(n)$, $q(n)$ and $\gamma(n)$ of the
parameters $Q$, $q$ and of the parameter $\gamma$ defined in (4) are described in [1], [2].

Further, let us denote by $L_m(x)$ the normed Legendre polynomial on
the set $[-1,1]$. The Legendre polynomials $P_m(x)$ are given by the well-known
Rodrigues formula

$$P_m(x) = \frac{1}{2^m\,m!} \frac{d^m}{dx^m} (x^2 - 1)^m$$

or, more conveniently for computation, by means of the recurrence relation with initial
conditions $P_0(x) = 1$, $P_1(x) = x$: for $m \ge 1$

$$(m+1)\,P_{m+1}(x) = (2m+1)\,x\,P_m(x) - m\,P_{m-1}(x),$$

with the orthogonality property

$$I(k,m) = \int_{-1}^{1} P_m(x)\,P_k(x)\,dx = 2/(2m+1), \quad m = k,$$

and $I(k,m) = 0$ otherwise. We can define $L_k(x) = P_k(x) \sqrt{k + 0.5}$ and, for the
multidimensional index $k = (k(1), k(2), \ldots, k(d))$, $k(j) = 0, 1, \ldots$,

$$\phi(k,z) = \prod_{j=1}^{d} L_{k(j)}(z(j)), \qquad z = \{z(j),\ j = 1, 2, \ldots, d\}.$$
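
The recurrence relation and the tensor-product basis translate directly into code.
The following is a minimal sketch, not the authors' implementation; the function names
`legendre_normed` and `phi` are hypothetical. The last lines check the orthonormality of
$L_0, \ldots, L_5$ on $[-1,1]$ numerically with Gauss-Legendre quadrature.

```python
import numpy as np

def legendre_normed(max_degree, x):
    """Rows L_0..L_max_degree evaluated at x, with L_k = P_k * sqrt(k + 0.5)."""
    x = np.asarray(x, dtype=float)
    P = np.empty((max_degree + 1,) + x.shape)
    P[0] = 1.0
    if max_degree >= 1:
        P[1] = x
    for m in range(1, max_degree):
        # (m + 1) P_{m+1} = (2m + 1) x P_m - m P_{m-1}
        P[m + 1] = ((2 * m + 1) * x * P[m] - m * P[m - 1]) / (m + 1)
    scale = np.sqrt(np.arange(max_degree + 1) + 0.5)
    return P * scale.reshape((-1,) + (1,) * x.ndim)

def phi(k, z):
    """phi(k, z) = prod_j L_{k(j)}(z(j)) for points z given as an array of shape (n, d)."""
    z = np.atleast_2d(np.asarray(z, dtype=float))
    out = np.ones(z.shape[0])
    for j, kj in enumerate(k):
        out *= legendre_normed(kj, z[:, j])[kj]
    return out

# Orthonormality check on [-1, 1] with a 16-point Gauss-Legendre rule
nodes, weights = np.polynomial.legendre.leggauss(16)
L = legendre_normed(5, nodes)
gram = (L * weights) @ L.T  # approximates the matrix of integrals of L_k * L_m
print(np.round(gram, 12))
```

The recurrence is used instead of the Rodrigues formula because it avoids symbolic
differentiation and needs only two previous values per degree.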

We denote $\nu = 2^{1/d}$ and for $N \in (1, N_d(n))$

$$R(N) = \{k : \max_j k(j) < N\}, \qquad W(N) = R([\nu N]) \setminus R(N).$$

Hereafter $[z]$ will denote the integer part of the (positive) variable $z$.

Since the function (signal) $f = f(x)$ is presumed to be square integrable,
$f \in L(2)$, it may be expanded in the L(2) sense as follows:

$$f(z) = \sum_k c(k)\,\phi(k,z), \qquad \rho(N) \stackrel{def}{=} \sum_{k \notin R(N)} (c(k))^2 \to 0, \quad N \to \infty.$$

We suppose (condition $\gamma$) that there exists a limit less than 0.5:

$$\gamma \stackrel{def}{=} \lim_{N \to \infty} \rho([\nu N]) / \rho(N) < 1/2, \qquad (4)$$

and will write $f \in K(\gamma)$. In the case when $\gamma = 0$ we will write $f \in K(0)$.
The condition $\gamma$ is satisfied if, e.g., as $N \to \infty$

$$\rho(N) \sim C(5)\,N^{-\beta} S(N), \quad \beta > d, \qquad (5)$$

where $S = S(N)$ is a slowly varying function as $N \to \infty$; the condition $f \in K(0)$ is
satisfied if, e.g., as $N \to \infty$

$$\rho(N) \sim C(6)\,a^N, \quad a = \mathrm{const} \in (0,1). \qquad (6)$$

The values $\rho(N) = \rho(f,N)$ are known and well studied in approximation
theory. Namely, $\rho(f,N) = E_N^2(f)$, where $E_N(f)$ is the error of the best
approximation of $f$ in the L(2) distance by algebraic polynomials of degree not
exceeding $N$ in each variable; these values are closely connected with the modulus
of continuity of the form

$$\omega(f,t) = \sup_{|h| \le t} \|\Delta_h^m f\|, \qquad \Delta_h^m f(x) = \sum_{l=0}^{m} (-1)^l f(x + (0.5m - l)\,h\,\psi(x)) / (l!\,(m-l)!),$$

$\psi(x) = (1 - x^2)^{0.5}$; $f(x + y) = f(\min(x + y, 1))$ if $y > 0$ and
$f(x + y) = f(\max(x + y, -1))$ if $y < 0$; $m = 0, 1, 2, \ldots$;
$h = (h(1), h(2), \ldots, h(d))$; $|h| = \max_j |h(j)|$.

For instance (see [7]), $\rho(f,N) \asymp N^{-2m}$ if and only if

$$\omega(f,t) \asymp t^m\,|\log(t)|^{0.5}, \quad t \in (0, 0.5].$$

Remark. The condition $\beta > d$, or the more general assumption $\gamma < 0.5$, is
necessary even in the case $d = 1$ ([1], [2]).

We can estimate the coefficients $c(k)$ as follows:

$$c(n,k) = n^{-1} \sum_{i=1}^{n} y(i)\,\phi(k, x_i).$$
Let us define

$$N_d(n) = \left[n^{1/d} (\log n)^{-2d/(d+1)}\right], \qquad \tau(N) = \tau(N,n) = \sum_{k \in W(N)} [c(n,k)]^2, \qquad N(n) = \mathrm{argmin}\{\tau(N,n),\ N \le N_d(n)\},$$

$$f_n = f_n(x) = \sum_{k \in R(N(n))} c(n,k)\,\phi(k,x). \qquad (7)$$

The function $f_n = f_n(x)$ is our adaptive estimate of the unknown signal
$f = f(x)$. It may be proved that the estimate $f_n$ is optimal in order as
$n \to \infty$ in the L(2) norm under conditions (3) and (5) in the minimax sense.
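
The construction (7) can be sketched in code for $d = 2$. This is a minimal
illustration under several assumptions that are not taken from the text: a Halton
design (bases 2 and 3) stands in for a Niederreiter sequence; the coefficient estimate
carries the explicit factor $2^d = 4$ so that the sample mean over the cube approximates
the Lebesgue integral; the search starts at $N = 3$ because for $d = 2$ the set $W(N)$
is empty for $N = 1, 2$; and the upper limit `N_max` is a user-chosen parameter rather
than $N_d(n)$. All function names and the test signal are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import legvander

def van_der_corput(n, base):
    """First n terms (i = 1..n) of the van der Corput sequence in the given base."""
    out = np.zeros(n)
    for idx in range(n):
        i, f = idx + 1, 1.0 / base
        while i > 0:
            out[idx] += (i % base) * f
            i //= base
            f /= base
    return out

def basis_1d(x, deg):
    """Matrix with columns L_0..L_deg at x, where L_k = P_k * sqrt(k + 0.5)."""
    return legvander(x, deg) * np.sqrt(np.arange(deg + 1) + 0.5)

def coefficients_2d(x, y, deg):
    """c(n, k) for all k with k(1), k(2) <= deg; the factor 4 = 2^d is an assumption."""
    B1, B2 = basis_1d(x[:, 0], deg), basis_1d(x[:, 1], deg)
    return 4.0 * np.einsum("i,ia,ib->ab", y, B1, B2) / len(y)

def adaptive_estimate_2d(x, y, N_max, N_min=3):
    """Pick N minimizing tau(N) = sum over W(N) of c(n,k)^2; return (N, estimate)."""
    nu = 2.0 ** 0.5                       # nu = 2^(1/d) with d = 2
    c = coefficients_2d(x, y, int(nu * N_max))
    def tau(N):
        M = int(nu * N)                   # [nu N]; R(N) = {k : max_j k(j) < N}
        return (c[:M, :M] ** 2).sum() - (c[:N, :N] ** 2).sum()
    N_hat = min(range(N_min, N_max + 1), key=tau)
    c_hat = c[:N_hat, :N_hat]
    def f_n(z):
        B1, B2 = basis_1d(z[:, 0], N_hat - 1), basis_1d(z[:, 1], N_hat - 1)
        return np.einsum("ab,ia,ib->i", c_hat, B1, B2)
    return N_hat, f_n

# Demo on a smooth test signal (hypothetical example, not from the paper)
rng = np.random.default_rng(1)
n, sigma = 4096, 0.2
x = 2.0 * np.column_stack([van_der_corput(n, 2), van_der_corput(n, 3)]) - 1.0
signal = lambda z: np.exp(z[:, 0]) * np.sin(2.0 * z[:, 1])
y = signal(x) + sigma * rng.standard_normal(n)
N_hat, f_n = adaptive_estimate_2d(x, y, N_max=6)
z = rng.uniform(-1.0, 1.0, size=(5000, 2))
rms_error = float(np.sqrt(np.mean((f_n(z) - signal(z)) ** 2)))
print(N_hat, rms_error)
```

Note that the selection of `N_hat` uses only the data $(x_i, y(i))$: neither the
smoothness of the signal nor the noise level enters the procedure.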

3. Properties of our estimate. Main result. After hard calculations
similar to those in [2], [3] we obtain that as $n \to \infty$

$$E\|f_n - f\|^2 \sim \min_N \left(\rho(N)\,[1 - \gamma] + \sigma^2 N^d / n\right);$$

therefore in the case $\gamma = 0$ our estimate $f_n(\cdot)$ is asymptotically optimal in the
L(2) sense.

In the case $\gamma \in (0, 0.5)$ we can modify our estimate (7) in order to obtain
an optimal estimate of $f(\cdot)$ as follows. Instead of the functional $\tau$ we introduce its
so-called penalty modification

$$\theta(N) = \theta(N,n) = \tau(N) - \gamma(n)\,\sigma^2(n)\,N/n \qquad (8)$$

and define as a modified, asymptotically optimal in the L(2) sense estimate of the
function $f$ the function

$$g_n(x) = g_n(x; V(n), \{y_i\}) = \sum_{k \in R(M(n))} c(n,k)\,\phi(k,x), \qquad (9)$$

$$M(n) = \mathrm{argmin}\{\theta(N,n),\ N \le N_d(n)\}.$$

Here $\gamma(n)$, $\sigma^2(n)$, $q(n)$, $Q(n)$ etc. are consistent estimates of the
parameters $\gamma$, $\sigma^2$, $q$, $Q$, described in [1], [2].
For instance,

$$\sigma^2(n) = \sum_{i=1}^{n} [f_n(x_i) - y(i)]^2 / (n - N_d + 1). \qquad (10)$$
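
The residual-based variance estimate (10) is a one-line computation once the fitted
values $f_n(x_i)$ are available. The sketch below assumes that they are passed in as an
array; the function name `sigma2_hat` is hypothetical.

```python
import numpy as np

def sigma2_hat(fitted, y, N_d):
    """Residual estimate (10): sum_i (f_n(x_i) - y(i))^2 / (n - N_d + 1)."""
    fitted, y = np.asarray(fitted, float), np.asarray(y, float)
    n = y.size
    if n - N_d + 1 <= 0:
        raise ValueError("need n - N_d + 1 > 0")
    return float(np.sum((fitted - y) ** 2) / (n - N_d + 1))

# Toy check: residuals +-1 at four points, N_d = 1, so 4 / (4 - 1 + 1) = 1.0
print(sigma2_hat([0.0, 0.0, 0.0, 0.0], [1.0, -1.0, 1.0, -1.0], N_d=1))
```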

4. Confidence region (c.r.). In this section we build the c.r. for $f(\cdot)$
in the L(2) sense. As a first approximation we can offer the following approach.
With probability tending to one as $n \to \infty$ the following inequality holds:

$$\|f_n - f\|^2 \le Q^2(n)\,\tau(N(n)) / (1 - \gamma(n)).$$

For a more exact c.r. we proved that

$$\|f_n - f\|^2 \le \frac{Q^2(n)\,\tau(N(n))}{1 - \gamma(n)} \times \left[1 + C(\gamma)\,\zeta\,(\log\log n)^{2/r} / \sqrt{n}\right],$$

$$r = 2dq(n)/(q(n) + 4d), \quad q(n) \in (0,2); \qquad r = dq(n)/(q(n) + d), \quad q(n) \ge 2,$$

where the non-negative random variable $\zeta$ is such that for all positive values $u > 0$

$$\sup_n P(\zeta > u) \le \exp\left(-u^{r/2}\right) \qquad (11)$$

(exponential confidence region).

5. Optimal adaptive denoising in other norms. Instead of the L(2) norm we can
also consider some stronger norms (in order to improve the sensitivity
of our method), for example, the L(p) norm or the uniform norm $L(\infty)$ in the space of
continuous functions $C[-1,1]$ etc.:

$$\Delta_p(h_n, f) = \|h_n - f\|_p \stackrel{def}{=} \left(E \int_{[-1,1]^d} |h_n(x) - f(x)|^p\,dx\right)^{1/p},$$

$$\Delta_\infty(h_n, f) = E \sup_{x \in [-1,1]^d} |h_n(x) - f(x)| = \lim_{p \to \infty} \Delta_p(h_n, f),$$

where $h_n(\cdot)$ is some estimate (measurement) of the signal $f(\cdot)$.

But for a consistent and optimal estimate in these spaces we need to use the
so-called Vallée-Poussin improvement of $g_n(\cdot)$. Namely, let us denote

$$|k| = \max_{j=1,2,\ldots,d} k(j), \qquad m_\infty = m_\infty(n) = N([n/\log(n)])$$

in the case $p = \infty$ and

$$m_p = m_p(n) = N(n)$$

in the case $p < \infty$.

We define the Vallée-Poussin modified coefficients

$$d_p(k,n) = c(k,n), \quad |k| < m_p;$$

$$d_p(k,n) = c(k,n)\,(\nu N(n) - |k|) / (\nu N(n) - m_p(n)), \quad |k| \in [m_p(n), \nu N(n)].$$

As the estimate $h_n(\cdot) = h_n^{(p)}(\cdot)$ of the signal $f(\cdot)$ we offer the following
improvement of the estimate $g_n$:

$$h_n^{(p)}(x) = \sum_{k \in R(\nu N(n))} d_p(k,n)\,\phi(k,x).$$

This estimate $h_n(\cdot) = h_n^{(p)}(\cdot)$ of the signal $f$ is optimal in order as $n \to \infty$ in
each L(p) norm, $p \in (2, \infty]$.
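
The Vallée-Poussin modification amounts to multiplying the estimated coefficients by a
taper that equals one for $|k| < m_p$ and decays linearly to zero on $[m_p, \nu N]$. The
sketch below computes such weights for a two-dimensional coefficient array; the function
name `vallee_poussin_weights` and the numerical values of `m_p` and `nu_N` are
illustrative assumptions.

```python
import numpy as np

def vallee_poussin_weights(k_abs, m_p, nu_N):
    """Weights w with d_p(k, n) = w(|k|) * c(k, n): 1 below m_p, linear decay on [m_p, nu_N]."""
    k_abs = np.asarray(k_abs, dtype=float)
    if nu_N <= m_p:
        raise ValueError("need nu_N > m_p")
    w = (nu_N - k_abs) / (nu_N - m_p)
    # values above 1 correspond to |k| < m_p, values below 0 to |k| > nu_N
    return np.clip(w, 0.0, 1.0)

# |k| = max_j k(j) for all multi-indices of a 9 x 9 coefficient array
k1, k2 = np.meshgrid(np.arange(9), np.arange(9), indexing="ij")
weights = vallee_poussin_weights(np.maximum(k1, k2), m_p=4, nu_N=8)
print(weights[:, 0])
```

The tapered estimate is then obtained by multiplying the coefficient array elementwise
by `weights` before summing the expansion.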

For the simple construction of a confidence region in the L(p) norms we proved also
that with probability tending to one as $n \to \infty$

$$\|h_n^{(p)} - f\|_p \le C_7(p, q(n), \gamma(n))\,Q(n)\,\tau(m_p(n)) / (1 - \gamma(n)), \quad p < \infty,$$

and

$$\|h_n^{(\infty)} - f\|_\infty \le C_8(q(n), \gamma(n))\,\tau(m_\infty(n))\,Q(n) / (1 - \gamma(n)).$$

6. Proofs. Notice that the complete mathematical proof of our assertions
uses modern martingale theory, for instance the exponential bounds for tails of
distributions in the Law of the Iterated Logarithm (LIL) for martingales, as in the
one-dimensional case considered in [3]; approximation theory [12]; the theory of
Banach spaces of random variables [13] etc.

Our proof is similar to the proofs in the one-dimensional case [3]; we explain
only briefly some new essential points.

A. Let us denote

$$A(n,N) = \rho(N) + \sigma^2 N^d / n, \qquad A(n) = \min_{N = 1, 2, \ldots} A(n,N);$$

$$B(n,N) = \sum_{k \in W(N)} (c(k))^2 + \sigma^2 N^d / n \sim \rho(N)\,(1 - \gamma) + \sigma^2 N^d / n;$$

$$B(n) = \min_{N = 1, 2, \ldots} B(n,N) = \min_{N \le N_d(n)} B(n,N);$$

$$N^0 = N^0(n) = \mathrm{argmin}_{N = 1, 2, \ldots} B(n,N) = \mathrm{argmin}_{N \le N_d(n)} B(n,N).$$

It follows from the condition ($\gamma$) that as $n \to \infty$

$$A(n,N) \asymp B(n,N), \qquad A(n) \asymp B(n),$$

and, by virtue of the condition ($\gamma$),

$$N^0 \asymp \mathrm{argmin}_{N = 1, 2, \ldots} A(n,N) \asymp \mathrm{argmin}_{N \le N_d(n)} A(n,N).$$

The value $(A(n))^{1/2}$ is the asymptotically optimal in the L(2) sense, as $n \to \infty$, rate of
convergence of arbitrary (i.e. not necessarily adaptive) estimates of the
function $f(\cdot)$ [14].
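
The trade-off in $A(n,N)$ between the approximation error $\rho(N)$ and the variance term
$\sigma^2 N^d / n$ can be made concrete by computing the non-adaptive ("oracle") value of
$N$ for a known $\rho$. The sketch below is an illustration only; the function name
`oracle_N` and the choice $\rho(N) = N^{-4}$ are assumptions made for the example.

```python
import numpy as np

def oracle_N(rho, sigma2, d, n, N_max):
    """Return (N0, A(n, N0)) minimizing A(n, N) = rho(N) + sigma2 * N^d / n over N = 1..N_max."""
    N = np.arange(1, N_max + 1)
    A = rho(N) + sigma2 * N.astype(float) ** d / n
    i = int(np.argmin(A))
    return int(N[i]), float(A[i])

# Hypothetical example: rho(N) = N^{-4}, d = 2, sigma^2 = 1, n = 10^6
N0, A0 = oracle_N(lambda N: N.astype(float) ** -4, sigma2=1.0, d=2, n=10**6, N_max=100)
print(N0, A0)
```

An adaptive procedure has to approach this value of $N$ without knowing $\rho$ or
$\sigma^2$, which is what the minimization of $\tau(N,n)$ is designed to do.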

B. We can write further:

$$c(k,n) = c(k) + n^{-1/2}\,\theta_k(n) + \eta(k,n),$$

where the deterministic variables

$$\eta(k,n) = n^{-1} \sum_{i=1}^{n} f(x_i)\,\phi(k,x_i) - c(k)$$

are the errors of the numerical computation of the Fourier-Legendre coefficients $\{c(k)\}$
by means of the design (set) $V(n)$ with equal weights.

After $d$-fold integration by parts, using the known properties
of Legendre polynomials and the condition ($\gamma$), we obtain:

$$\eta(k,n) = \int_{[-1,1]^d} f(x)\,\phi(k,x)\,d(G_n(x) - F(x));$$

$$|\eta(k,n)| \le C(\gamma,d)\,\sup_x |G_n(x) - F(x)|\,(|k|^d + 1) = C(\gamma,d)\,\delta(n,V(n))\,(|k|^d + 1);$$
$$\Sigma^2 \stackrel{def}{=} \sum_{k \in R(\nu N)} |\eta(k,n)|^2 \le C(\gamma,d)\,\log^{2d}(n)\,N^{3d}\,n^{-2}.$$

Since the value $N(n)$ belongs to the interval $(1, N_d(n))$, we conclude after simple
computations that the sum $\Sigma^2$ does not exceed $C N^d / n \le B(n,N) \sim \tau(n,N)$.
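C. We have:

$$\theta_k(n) = \sigma\,n^{-1/2} \sum_{i=1}^{n} \xi_i\,\phi_k(x_i).$$

It follows from the multidimensional CLT that the variables $\{\theta_k(n)\}$ for all
values of $k$ are, as $n \to \infty$, asymptotically Gaussian distributed and independent:

$$\mathrm{Var}[\theta_k(n)] = n^{-1} \sum_{i=1}^{n} \sigma^2 \phi_k^2(x_i) \to \sigma^2 \int_{[-1,1]^d} \phi_k^2(x)\,dx = \sigma^2;$$

$$E\,\theta_k(n)\,\theta_l(n) = \sigma^2 n^{-1} \sum_{i=1}^{n} \phi_k(x_i)\,\phi_l(x_i) \to \sigma^2 \int_{[-1,1]^d} \phi_k(x)\,\phi_l(x)\,dx = 0, \quad k \ne l.$$

Consequently, the variables $c(k,n)$ are asymptotically independent and have
approximately the normal distribution:

$$\mathrm{Law}(c(k,n)) \asymp N(c(k), \sigma^2/n),$$

or, equivalently,

$$c(k,n) = c(k) + \sigma \epsilon_k / \sqrt{n}, \qquad \mathrm{Law}(\epsilon_k) \asymp N(0,1),$$

and the $\{\epsilon_k\}$ are also asymptotically independent. Therefore

$$\tau(n,N) \asymp \sum_{k \in W(N)} |c(k)|^2 + 2 n^{-1/2} \sigma \sum_{k \in W(N)} c_k \epsilon_k + \sigma^2 n^{-1} \sum_{k \in W(N)} \epsilon_k^2$$

$$= \sum_{k \in W(N)} |c(k)|^2 + 2 n^{-1/2} \sigma \sum_{k \in W(N)} c_k \epsilon_k + \sigma^2 n^{-1} N^d + \sigma^2 n^{-1} \sum_{k \in W(N)} (\epsilon_k^2 - 1);$$

$$E\tau(n,N) \asymp B(n,N), \qquad \mathrm{Var}[\tau(n,N)] \asymp B(n,N)/n,$$

and hence

$$N \to \infty,\ N/n \to 0 \ \Rightarrow\ \sqrt{\mathrm{Var}[\tau(n,N)]} / E\tau(n,N) \to 0.$$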
Note that the conditions $\gamma < 1/2$ and (2) were used here and are essential, as is
common in statistical research.

D. It follows from our considerations that there are some grounds to conclude that

$$\tau(n,N) \asymp E\tau(n,N) \asymp A(n,N);$$

thus,

$$N(n) = \mathrm{argmin}_{N \le [n^{1/d}/3]} \tau(n,N) \sim \mathrm{argmin}_{N \le [n^{1/d}/3]} E\tau(n,N) = N^0(n).$$

Since our adaptive (random!) number of summands $N(n)$ is close to the
optimal (non-adaptive) value $N^0(n)$, our estimate is also optimal.

More exactly, we can write as a first approximation, without the terms
$\{\eta(k,n)\}$:

$$c(k,n) = c(k) + \sigma n^{-1} \sum_{i=1}^{n} \xi_i\,\phi(k,x_i);$$

$$(c(k,n))^2 = c^2(k) + \sigma^2 n^{-2} \sum_{i=1}^{n} \xi_i^2\,\phi^2(k,x_i) + 2\sigma n^{-1} \sum_{i=1}^{n} c(k)\,\xi_i\,\phi(k,x_i) + 2\sigma^2 n^{-2} \sum\sum_{1 \le i < j \le n} \xi_i \xi_j\,\phi(k,x_i)\,\phi(k,x_j).$$

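We have for the variables $\tau(n,N)$ (and further for the variables
$\Delta^2 = \Delta^2(n,N) = \|f_n - f\|^2$):

$$\tau(n,N) = \sum_{k \in W(N)} c_k^2 + \sigma^2 n^{-2} \sum_{k \in W(N)} \sum_{i=1}^{n} \phi^2(k,x_i) + 2\sigma n^{-1} \sum_{i=1}^{n} \xi_i \sum_{k \in W(N)} c(k)\,\phi(k,x_i) + \tau_2,$$

$$\tau_2 = \sigma^2 n^{-2} \sum_{i=1}^{n} (\xi_i^2 - 1) \sum_{k \in W(N)} \phi^2(k,x_i) + 2\sigma^2 n^{-2} \sum\sum_{1 \le i < j \le n} \xi_i \xi_j \sum_{k \in W(N)} \phi(k,x_i)\,\phi(k,x_j).$$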

Note that the sequences of the form

$$\eta_1(n) = \sum_{i=1}^{n} b(i)\,\xi_i, \qquad \eta_2(n) = \sum_{i=1}^{n} b(i)\,(\xi_i^2 - 1) \qquad \text{and} \qquad \eta_3(n) = \sum\sum_{1 \le i < j \le n} b(i,j)\,\xi_i \xi_j,$$

where $\{b(i)\}$, $\{b(i,j)\}$ are non-random sequences, taken together with
$F(n) = \sigma(\{\xi_i\},\ i = 1, 2, \ldots, n)$, i.e. the pairs $\{\eta_s(n), F(n)\}$, $s = 1, 2, 3$,
where $\{F(n),\ n = 1, 2, 3, \ldots\}$ is the natural sequence (flow) of sigma-algebras
(filtration), are martingales.

Using the main result of the paper [15], devoted to the Law of the Iterated Logarithm
for martingales, and repeating the considerations of the article [3] for the
one-dimensional case, we obtain the desired result.

7. An example. Suppose that for some constants $\beta > 0.5$, $K \in (0, \infty)$ as $N \to \infty$

$$\rho(N) \sim K^{d+2\beta} N^{-2\beta} / (2\beta).$$

Then we have for the estimate $g_n(\cdot)$ as $n \to \infty$:

$$E\|g_n - f\|^2 \sim K^d\,n^{-2\beta/(2\beta+d)}\,\sigma^{4\beta/(d+2\beta)} \times d^{\,2\beta/(2\beta+d)} \left[\frac{1}{2\beta} + \frac{1}{d}\right]. \qquad (12)$$

Thus, the rate of convergence $g_n \to f$ in the L(2) sense is optimal ([2]).
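
The right-hand side of (12) can be checked by a direct minimization. The derivation
below is a sketch under two assumptions that are ours, not the text's: the risk of
$g_n$ is taken to be asymptotically equal to $\min_N A(n,N)$ with
$A(n,N) = \rho(N) + \sigma^2 N^d / n$, and $N$ is treated as a continuous variable, with
$N_*$ denoting the minimizer.

```latex
\begin{align*}
A(n,N) &= \frac{K^{d+2\beta} N^{-2\beta}}{2\beta} + \frac{\sigma^2 N^d}{n}, \\
\frac{\partial A}{\partial N} = 0 &\iff N_*^{\,d+2\beta} = \frac{K^{d+2\beta}\,n}{d\,\sigma^2}
  \iff N_* = K \left(\frac{n}{d\,\sigma^2}\right)^{1/(d+2\beta)}, \\
A(n,N_*) &= K^d \left(\frac{d\,\sigma^2}{n}\right)^{2\beta/(d+2\beta)}
  \left[\frac{1}{2\beta} + \frac{1}{d}\right]
  = K^d\, n^{-2\beta/(2\beta+d)}\, \sigma^{4\beta/(d+2\beta)}\, d^{\,2\beta/(2\beta+d)}
  \left[\frac{1}{2\beta} + \frac{1}{d}\right].
\end{align*}
```

The two summands in the bracket come from the approximation term and the variance term
of $A(n,N_*)$ respectively.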

Note that the construction of our estimates does not use the (unknown, as
usual) parameters $K$, $\beta$ (adaptiveness).

Notice in conclusion that the proposed estimates have been tested experimentally,
both on simulated data generated with pseudo-random numbers and on real data
(seismic signals etc.), where our estimates of the different signals $f$ were
compared with classical estimates obtained by kernel or wavelet methods. The
proposed estimates were more precise.

8. Computational complexity. The number $AM(n)$ of elementary
operations and square root calculations in the offered algorithm, if we use the
so-called Fast Legendre Transform (FLT) [8], is equal to

$$AM(n) \asymp (C(d)\,n \log_2 n)^d.$$

Recall (see [9]) that the number of these operations when using the classical Fast
Fourier Transform (FFT), even in the $d$-dimensional case, is equal to $C(d)\,n \log_2 n$.

The advantage of our estimates in comparison with the trigonometric estimates
[2] is especially pronounced when the estimated function $f(\cdot)$ is not periodic:
$f(-1, -1, \ldots, -1) \ne f(1, 1, \ldots, 1)$.

9. Detection of a signal. We can use our adaptive c.r. to construct a test
for the presence (detection) of a signal. Namely, let us consider the following
hypothesis testing problem: $H_0 = \{f = 0\}$ (the absence of a signal) versus the
alternative $H_1 = \{f \ne 0\}$ (the presence of a signal).

Since the hypothesis $H_0$ may be reformulated as $H_0 = \{\|f\|^2 = 0\}$ and the
alternative has the form $H_1 = \{\|f\|^2 > 0\}$, we can offer the following test.

Let $\delta \in (0, 1/3)$ be some small number, for example, 0.05 or 0.01 etc., such
that the value $\delta$ is the allowed level of the error of the first kind:

$$P(H_1 / H_0) \le \delta. \qquad (13)$$

Our test $\phi$ may be defined as follows: $\phi = 1$ if and only if

$$\|f_n\|^2 > K(\delta),$$

and $\phi = 0$ otherwise.

Here $\phi$ denotes the number of our decision: we conclude $H_1$ in the case
$\phi = 1$ and $H_0$ otherwise.

Here the value $K(\delta) = K_n(\delta)$ may be computed from (11) and (13), on the basis
of the equality

$$P_0(\|f_n\|^2 > K(\delta)) = \delta.$$

The notation $P_0(A)$, where $A$ is an arbitrary event, denotes as usual the probability
of $A$ calculated under the assumption of the absence of the signal $f$.
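In detail,

$$P_0\left(\frac{Q^2(n)\,\tau(N(n))}{1 - \gamma(n)} \left[1 + C(\gamma)\,\zeta\,\frac{(\log\log n)^{2/r}}{\sqrt{n}}\right] > K_n(\delta)\right) = \delta.$$

Solving the last equality with respect to $K_n(\delta)$, we find:

$$K_n(\delta) = \frac{Q^2(n)\,\tau(N(n))}{1 - \gamma(n)} \left(1 + C(\gamma)\,|\log \delta|^{2/r}\,\frac{(\log\log n)^{2/r}}{\sqrt{n}}\right). \qquad (14)$$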



The advantage of the test offered here over, e.g., the tests described in [10], [11]
etc. is the following. Our procedure is nonparametric and adaptive, but is still
consistent and asymptotically optimal in the L(2) sense.
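
The decision rule can be sketched in code. The threshold below follows our reading of
(11) and (14): the tail bound $\exp(-u^{r/2}) = \delta$ gives $u = |\log \delta|^{2/r}$, and
$K_n(\delta) = Q^2 \tau / (1 - \gamma) \cdot (1 + C\,u\,(\log\log n)^{2/r} / \sqrt{n})$. The
function names are hypothetical, and the constant `C`, which stands for the unspecified
$C(\gamma)$, as well as all other inputs must be supplied by the user.

```python
import math

def detection_threshold(Q2, tau, gamma, C, r, n, delta):
    """K_n(delta) in the form of (14); C stands for the unspecified constant C(gamma)."""
    if not (0.0 < delta < 1.0 and 0.0 <= gamma < 1.0 and n >= 3 and r > 0):
        raise ValueError("invalid arguments")
    u = abs(math.log(delta)) ** (2.0 / r)      # solves exp(-u^(r/2)) = delta, cf. (11)
    correction = C * u * math.log(math.log(n)) ** (2.0 / r) / math.sqrt(n)
    return Q2 * tau / (1.0 - gamma) * (1.0 + correction)

def detect(norm2_fn, K):
    """Test decision: 1 (signal present) if ||f_n||^2 > K, else 0."""
    return 1 if norm2_fn > K else 0

# Hypothetical inputs, for illustration only
K = detection_threshold(Q2=2.0, tau=3.0, gamma=0.25, C=1.0, r=1.0, n=1000, delta=0.05)
print(K, detect(10.0, K))
```

A smaller level $\delta$ increases $|\log \delta|$ and therefore the threshold, so the
test becomes more conservative.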